Fast parallel CRC algorithm and implementation on a configurable processor

Authors

H. Michael Ji

Earl Killian

Abstract

In this paper we present a fast cyclic redundancy check (CRC) algorithm that computes the CRC of a message of any length in parallel. Given a message of arbitrary length, we first split it into blocks, each of a fixed size equal to the degree of the generator polynomial. We then perform the CRC computation over the blocks in parallel using Galois Field multiplication and accumulation (GFMAC). In theory, our fast parallel CRC algorithm can achieve unlimited speedup over the bit-serial algorithm or the byte-wise table lookup algorithm, at the expense of adding enough GFMAC units: with enough units, it can compute the CRC of a message of any length in 2 to 3 clock cycles. In practice, we use a configurable processor to which a customized instruction is added that performs multiple pairs of GF multiplication and accumulation. For example, a 4-GFMAC implementation can compute a 32-bit CRC for a 16-byte message in 2 to 3 cycles. This level of performance is hundreds of times faster than the bit-serial CRC algorithm and tens of times faster than the byte-wise parallel CRC algorithm. The generator polynomial can be either software-programmed or hard-coded. Our algorithm adds only a small number of logic gates to the processor core.
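The block decomposition described in the abstract can be sketched in software as follows. This is a minimal illustration, not the authors' implementation: it assumes a "pure" CRC (zero initial value, no bit reflection, no final XOR), the CRC-32 generator 0x04C11DB7, and a message length that is a multiple of the block size. The function names (`gf_mul`, `crc_bit_serial`, `crc_blockwise`) are hypothetical. Each block is multiplied by a precomputed constant x^(m·k) mod G(x) and the products are XOR-accumulated; the products do not depend on one another, which is what would allow separate GFMAC units to compute them at the same time.

```python
POLY = 0x04C11DB7  # CRC-32 generator G(x) without its x^32 term (assumed for this sketch)
M = 32             # degree of G(x), which is also the block size in bits


def gf_mul(a, b, poly=POLY, m=M):
    """Multiply two polynomials over GF(2) and reduce the product modulo G(x)."""
    full = (1 << m) | poly  # G(x) including the x^m term
    prod = 0
    for i in range(m):      # carry-less (XOR) multiplication
        if (b >> i) & 1:
            prod ^= a << i
    for bit in range(2 * m - 2, m - 1, -1):  # clear bits of degree >= m
        if (prod >> bit) & 1:
            prod ^= full << (bit - m)
    return prod


def crc_bit_serial(data, poly=POLY, m=M):
    """Reference: bit-at-a-time computation of M(x) * x^m mod G(x)."""
    mask = (1 << m) - 1
    reg = 0
    for byte in data:
        for k in range(7, -1, -1):  # most significant bit first
            inbit = (byte >> k) & 1
            top = (reg >> (m - 1)) & 1
            reg = (reg << 1) & mask
            if top ^ inbit:
                reg ^= poly
    return reg


def crc_blockwise(data, poly=POLY, m=M):
    """CRC as an XOR-sum of independent per-block GF multiplications."""
    nbytes = m // 8
    assert len(data) % nbytes == 0, "sketch assumes whole blocks"
    blocks = [int.from_bytes(data[i:i + nbytes], "big")
              for i in range(0, len(data), nbytes)]
    n = len(blocks)
    # betas[j] = x^(m*(j+1)) mod G(x); betas[0] = x^m mod G(x) = poly
    betas = [poly]
    for _ in range(n - 1):
        betas.append(gf_mul(betas[-1], poly, poly, m))
    crc = 0
    for i, w in enumerate(blocks):
        # Block i carries weight x^(m*(n-i)); every product is independent,
        # so in hardware they could be formed in parallel.
        crc ^= gf_mul(w, betas[n - 1 - i], poly, m)
    return crc


msg = bytes(range(16))  # a 16-byte message, i.e. four 32-bit blocks
assert crc_blockwise(msg) == crc_bit_serial(msg)
```

With a 16-byte message and a 32-bit generator there are four blocks, which matches the 4-GFMAC example in the abstract. In this software sketch the loop over blocks still runs sequentially.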

Similar resources

Fast Cellular Automata Implementation on Graphic Processor Unit (GPU) for Salt and Pepper Noise Removal

Noise removal operation is commonly applied as pre-processing step before subsequent image processing tasks due to the occurrence of noise during acquisition or transmission process. A common problem in imaging systems by using CMOS or CCD sensors is appearance of the salt and pepper noise. This paper presents Cellular Automata (CA) framework for noise removal of distorted image by the salt an...

Optimizing Matrix-matrix Multiplication for an Embedded Vliw Processor

The optimization of matrix-matrix multiplication (MMM) performance has been well studied on conventional general-purpose processors like the Intel Pentium 4. Fast algorithms, such as those in the Goto and ATLAS BLAS libraries, exploit common microarchitectural features including superscalar execution and the cache and TLB hierarchy to achieve near-peak performance. However, the microarchitectur...

A Low-Power High Throughput Configurable FFT/IFFT Processor for WLAN and WiMax Protocols

This paper presents a configurable Fast Fourier Transform (FFT) processor targeting the IEEE 802.11n (WLAN) and the IEEE 802.16 (WiMax) wireless protocols. Such processor is based upon the Radix-2 SinglePath Delay Feedback (R2SDF) architecture and can be configured to operate on 64/128/512/1024/2048-point sequences. It was synthesized for a 90nm commercial standard-cells library by using Synops...

A Coarse-Grain Hierarchical Technique for 2-Dimensional FFT on Configurable Parallel Computers

FPGAs (Field-Programmable Gate Arrays) have been widely used as coprocessors to boost the performance of data-intensive applications [1][2]. However, there are several challenges to further boost FPGA performance: the communication overhead between the host workstation and the FPGAs can be substantial; large-scale applications cannot fit in a single FPGA because of its limited capacity; mapping...

An Effective Hybrid Genetic Algorithm for Hybrid Flow Shops with Sequence Dependent Setup Times and Processor Blocking

Hybrid flow-shop or flexible flow shop problems have remained subject of intensive research over several years. Hybrid flow-shop problems overcome one of the limitations of the classical flow-shop model by allowing parallel processors at each stage of task processing. In many papers the assumptions are generally made that there is unlimited storage available between stages and the setup times a...

Publication date: 2002